223 research outputs found

    The statistical analysis of multi-environment data: modeling genotype-by-environment interaction and its genetic basis

    Genotype-by-environment interaction (GEI) is an important phenomenon in plant breeding. This paper presents a series of models for describing, exploring, understanding, and predicting GEI. All models depart from a two-way table of genotype by environment means. First, a series of descriptive and explorative models/approaches are presented: Finlay–Wilkinson model, AMMI model, GGE biplot. All of these approaches have in common that they merely try to group genotypes and environments and do not use other information than the two-way table of means. Next, factorial regression is introduced as an approach to explicitly introduce genotypic and environmental covariates for describing and explaining GEI. Finally, QTL modeling is presented as a natural extension of factorial regression, where marker information is translated into genetic predictors. Tests for regression coefficients corresponding to these genetic predictors are tests for main effect QTL expression and QTL by environment interaction (QEI). QTL models for which QEI depends on environmental covariables form an interesting model class for predicting GEI for new genotypes and new environments. For realistic modeling of genotypic differences across multiple environments, sophisticated mixed models are necessary to allow for heterogeneity of genetic variances and correlations across environments. The use and interpretation of all models is illustrated by an example data set from the CIMMYT maize breeding program, containing environments differing in drought and nitrogen stress. To help readers carry out the statistical analyses, GenStat® programs, 15th Edition and Discovery® version, are presented as an “Appendix”.

    Accurate genotype imputation in multiparental populations from low-coverage sequence

    Many different types of multiparental populations have recently been produced to increase genetic diversity and resolution in QTL mapping. Low-coverage, genotyping-by-sequencing (GBS) technology has become a cost-effective tool in these populations, despite large amounts of missing data in offspring and founders. In this work, we present a general statistical framework for genotype imputation in such experimental crosses from low-coverage GBS data. Generalizing a previously developed hidden Markov model for calculating ancestral origins of offspring DNA, we present an imputation algorithm that does not require parental data and that is applicable to bi- and multiparental populations. Our imputation algorithm allows heterozygosity of parents and offspring as well as error correction in observed genotypes. Further, our approach can combine imputation and genotype calling from sequencing reads, and it also applies to called genotypes from SNP array data. We evaluate our imputation algorithm by simulated and real data sets in four different types of populations: the F2, the advanced intercross recombinant inbred lines, the multiparent advanced generation intercross, and the cross-pollinated population. Because our approach uses marker data and population design information efficiently, the comparisons with previous approaches show that our imputation is accurate at even very low (<1×) sequencing depth, in addition to having accurate genotype phasing and error detection.

    Genome-wide screening for cis-regulatory variation using a classical diallel crossing scheme

    Large-scale screening studies carried out to date for genetic variants that affect gene regulation are generally limited to descriptions of differences in allele-specific expression (ASE) detected in vivo. Allele-specific differences in gene expression provide evidence for a model whereby cis-acting genetic variation results in differential expression between alleles. Such gene surveys for regulatory variation are a first step in identifying the specific nucleotide changes that govern gene expression differences, but they leave the underlying mechanisms unexplored. Here, we propose a quantitative genetics approach to perform a genome-wide analysis of ASE differences (GASED). The GASED approach is based on a diallel design that is often used in plant breeding programs to estimate general combining abilities (GCAs) of specific inbred lines and to identify high-yielding hybrid combinations of parents based on their specific combining abilities (SCAs). In a context of gene expression, the values of GCA and SCA parameters allow cis- and trans-regulatory changes to be distinguished and imbalances in gene expression to be ascribed to cis-regulatory variation. With this approach, a total of 715 genes could be identified that are likely to carry allelic polymorphisms responsible for at least a 1.5-fold allelic expression difference in a total of 10 diploid Arabidopsis thaliana hybrids. The major strength of the GASED approach, compared to other ASE detection methods, is that it is not restricted to genes with allelic transcript variants. Although a false-positive rate of 9/41 was observed, the GASED approach is a valuable pre-screening method that can accelerate systematic surveys of naturally occurring cis-regulatory variation among inbred lines for laboratory species, such as Arabidopsis, mouse, rat and fruit fly, and economically important crop species, such as corn.

    Genomic prediction of grain yield and drought-adaptation capacity in sorghum is enhanced by multi-trait analysis

    Grain yield and the stay-green drought adaptation trait are important targets of selection in grain sorghum breeding for broad adaptation to a range of environments. Genomic prediction for these traits may be enhanced by joint multi-trait analysis. The objectives of this study were to assess the capacity of multi-trait models to improve genomic prediction of parental breeding values for grain yield and stay-green in sorghum by using information from correlated auxiliary traits, and to determine the combinations of traits that optimize predictive results in specific scenarios. The dataset included phenotypic performance of 2645 testcross hybrids across 26 environments as well as genomic and pedigree information on their female parental lines. The traits considered were grain yield (GY), stay-green (SG), plant height (PH), and flowering time (FT). We evaluated the improvement in predictive performance of multi-trait G-BLUP models relative to single-trait G-BLUP. The use of a blended kinship matrix exploiting pedigree and genomic information was also explored to optimize multi-trait predictions. Predictive ability for GY increased up to 16% when PH information on the training population was exploited through multi-trait genomic analysis. For SG prediction, full advantage from multi-trait G-BLUP was obtained only when GY information was also available on the predicted lines per se, with predictive ability improvements of up to 19%. Predictive ability, unbiasedness and accuracy of predictions from conventional multi-trait G-BLUP were further optimized by using a combined pedigree-genomic relationship matrix. Results of this study suggest that multi-trait genomic evaluation combining routinely measured traits may be used to improve prediction of crop productivity and drought adaptability in grain sorghum.
    EEA Pergamino
    Affiliation: Velazco, Julio. Instituto Nacional de Tecnología Agropecuaria (INTA). Estación Experimental Agropecuaria Pergamino. Sección Forrajeras; Argentina. Wageningen University and Research. Biometris – Mathematical and Statistical Methods; Netherlands
    Affiliation: Jordan, David R. The University of Queensland. Hermitage Research Facility. Queensland Alliance for Agriculture and Food Innovation; Australia
    Affiliation: Mace, Emma S. The University of Queensland. Hermitage Research Facility. Queensland Alliance for Agriculture and Food Innovation; Australia. Hermitage Research Facility. Department of Agriculture and Fisheries; Australia
    Affiliation: Hunt, Colleen H. The University of Queensland. Hermitage Research Facility. Queensland Alliance for Agriculture and Food Innovation; Australia. Hermitage Research Facility. Department of Agriculture and Fisheries; Australia
    Affiliation: Malosetti, Marcos. Wageningen University and Research. Biometris – Mathematical and Statistical Methods; Netherlands
    Affiliation: Eeuwijk, Fred A. van. Wageningen University and Research. Biometris – Mathematical and Statistical Methods; Netherlands

    Southeast of What? Reflections on SEALS' Success

    In epidemiologic studies, measurement error in dietary variables often attenuates the association between dietary intake and disease occurrence. To adjust for the attenuation caused by error in dietary intake, regression calibration is commonly used. To apply regression calibration, unbiased reference measurements are required. Short-term reference measurements for foods that are not consumed daily contain excess zeroes that pose challenges in the calibration model. We adapted a two-part regression calibration model, initially developed for multiple replicates of reference measurements per individual, to a single-replicate setting. We showed how to handle excess zero reference measurements with a two-step modeling approach, how to explore heteroscedasticity in the consumed amount with a variance-mean graph, how to explore nonlinearity with the generalized additive modeling (GAM) and the empirical logit approaches, and how to select covariates in the calibration model. The performance of the two-part calibration model was compared with its one-part counterpart. We used vegetable intake and mortality data from the European Prospective Investigation on Cancer and Nutrition (EPIC) study. In EPIC, reference measurements were taken with 24-hour recalls. For each of the three vegetable subgroups assessed separately, correcting for error with an appropriately specified two-part calibration model resulted in an approximately threefold increase in the strength of association with all-cause mortality, as measured by the log hazard ratio. We further found that the standard way of including covariates in the calibration model can lead to overfitting the two-part calibration model. Moreover, the extent of adjusting for error is influenced by the number and forms of covariates in the calibration model. For episodically consumed foods, we advise researchers to pay special attention to response distribution, nonlinearity, and covariate inclusion in specifying the calibration model.

    Reconstruction of Networks with Direct and Indirect Genetic Effects

    Genetic variance of a phenotypic trait can originate from direct genetic effects, or from indirect effects, i.e., through genetic effects on other traits, affecting the trait of interest. This distinction is often of great importance, for example, when trying to improve crop yield and simultaneously control plant height. As suggested by Sewall Wright, assessing contributions of direct and indirect effects requires knowledge of (1) the presence or absence of direct genetic effects on each trait, and (2) the functional relationships between the traits. Because experimental validation of such relationships is often unfeasible, it is increasingly common to reconstruct them using causal inference methods. However, most current methods require all genetic variance to be explained by a small number of quantitative trait loci (QTL) with fixed effects. Only a few authors have considered the “missing heritability” case, where contributions of many undetectable QTL are modeled with random effects. Usually, these are treated as nuisance terms that need to be eliminated by taking residuals from a multi-trait mixed model (MTM). But fitting such an MTM is challenging, and it is impossible to infer the presence of direct genetic effects. Here, we propose an alternative strategy, where genetic effects are formally included in the graph. This has important advantages: (1) genetic effects can be directly incorporated in causal inference, implemented via our PCgen algorithm, which can analyze many more traits; and (2) we can test the existence of direct genetic effects, and improve the orientation of edges between traits. Finally, we show that reconstruction is much more accurate if individual plant or plot data are used, instead of genotypic means. We have implemented the PCgen algorithm in the R package pcgen.

    Determinants of barley grain yield in drought-prone Mediterranean environments

    The determinants of barley grain yield in drought-prone Mediterranean environments have been studied in the Nure × Tremois (NT) population. A large set of yield and other morpho-physiological data were recorded in 118 doubled haploid (DH) lines of the population, in multi-environment field trials (18 site-year combinations). Agrometeorological variables were also recorded and calculated at each site. Four main periods of barley development were considered (vegetative, reproductive, early grain filling and late grain filling) to dissect the effect of the growth phases on yield traits. Relationships between agrometeorological variables, grain yield (GY) and its main components (GN and GW) were also investigated by correlation. Results first gave a clear indication of the involvement of water consumption, calculated from sowing to the early grain filling period, in determining GY and GW (r²=0.616, P=0.007 and r²=0.703, P=0.005, respectively), while GN showed its highest correlation with the total photothermal quotient (PQ) calculated for the same period (r²=0.646, P=0.013). With the only exception of total PQ calculated during the vegetative period, all significant correlations with GY were associated with water-dependent agrometeorological parameters. As a second result, the NT segregating population allowed us to weigh the amount of interaction due to genotypes over environments or to environments in relation to genotypes by a GGE analysis; 47.67% of the G+GE sum of squares was explained by the first two principal components. Then, the introduction of genomic information at major barley genes regulating the length of the growth cycle allowed us to explain patterns of adaptation of different groups of NT lines according to the variants (alleles) harbored at vernalization (Vrn-H1) in combination with earliness (Eam6) genes. The superiority of the lines carrying the Nure allele at Eam6 was confirmed by factorial ANOVA testing the four possible haplotypes obtained by combining alternative alleles at Eam6 and Vrn-H1. Maximum yield potential and differentials among the NT genotypes were finally explored through the Finlay-Wilkinson model, interpreting grain yield of NT genotypes together with yield adaptability (Ya), taken as the regression coefficient bi; Ya ranged from 0.71 for NT77 to 1.20 for NT19. Lines harboring the Nure variants at both genes were the highest yielding (3.04 t ha⁻¹) and showed the highest yield adaptability (bi=1.05). The present study constitutes a starting point towards the introduction of genomic variables in agronomic models for barley grain yield in Mediterranean environments.
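
As a rough illustration of the Finlay-Wilkinson regression coefficient bi mentioned in the abstract above, the following is a minimal sketch that regresses each genotype's yield on an environmental index (environment mean minus grand mean). The function name, the ordinary least squares fit, and the toy yield table are assumptions made for illustration only; they are not the data or code of the study.

```python
import numpy as np

def finlay_wilkinson(means):
    """Return genotype means, slopes b_i and the environmental index.

    `means` is a genotypes x environments table of mean yields
    (hypothetical input layout, assumed complete with no missing cells).
    """
    means = np.asarray(means, dtype=float)
    # Environmental index: environment mean minus grand mean (sums to zero).
    env_index = means.mean(axis=0) - means.mean()
    geno_mean = means.mean(axis=1)
    # Ordinary least squares slope of each genotype's yields on the index.
    centered = means - geno_mean[:, None]
    slopes = centered @ env_index / (env_index @ env_index)
    return geno_mean, slopes, env_index

# Toy table: 3 genotypes (rows) in 3 environments (columns), values in t/ha.
yields = np.array([
    [2.0, 3.0, 4.5],
    [1.5, 3.5, 5.5],
    [2.5, 2.8, 3.5],
])
geno_mean, slopes, env_index = finlay_wilkinson(yields)
for g, (m, b) in enumerate(zip(geno_mean, slopes), start=1):
    print(f"genotype {g}: mean = {m:.2f}, b_i = {b:.2f}")
```

With a complete table, the slopes average exactly 1 by construction, so a value of bi above 1 marks a genotype that responds more strongly than average to better environments and a value below 1 marks a less responsive one.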

    Common bean SNP alleles and candidate genes affecting photosynthesis under contrasting water regimes

    Water deficit is a major worldwide constraint to common bean (Phaseolus vulgaris L.) production, with photosynthesis being one of the most affected physiological processes. To gain insights into the genetic basis of the photosynthetic response of common bean under water-limited conditions, a collection of 158 Portuguese accessions was grown under both well-watered and water-deficit regimes. Leaf gas-exchange parameters were measured and photosynthetic pigments quantified. The same collection was genotyped using SNP arrays, and SNP-trait associations were tested using a linear mixed model accounting for the genetic relatedness among accessions. A total of 133 SNP-trait associations were identified for net CO2 assimilation rate, transpiration rate, stomatal conductance, and chlorophyll a and b, carotene, and xanthophyll contents. Ninety of these associations were detected under water-deficit and 43 under well-watered conditions, with only two associations common to both treatments. Identified candidate genes revealed that stomatal regulation, protein translocation across membranes, redox mechanisms, hormone, and osmotic stress signaling were the most relevant processes involved in common bean response to water-limited conditions. These candidates are now preferential targets for common bean water-deficit-tolerance breeding. Additionally, new sources of water-deficit tolerance of Andean, Mesoamerican, and admixed origin were detected as accessions valuable for breeding, and not yet explored.